KENDALLTAU
Overview
The KENDALLTAU function calculates Kendall’s tau, a non-parametric measure of correlation for ordinal data. Unlike Pearson correlation, which measures linear relationships, Kendall’s tau assesses the strength of association between two rankings or ordinal variables by counting the number of concordant and discordant pairs of observations.
Kendall’s tau was introduced by Maurice G. Kendall in 1938 as “A New Measure of Rank Correlation” (Biometrika, Vol. 30). The statistic ranges from -1 to +1, where values close to 1 indicate strong agreement between rankings, values close to -1 indicate strong disagreement, and values near 0 suggest no association.
This implementation uses the SciPy library’s scipy.stats.kendalltau function and supports two variants:
- Tau-b (default): Adjusts for tied ranks and is suitable when both variables may contain ties. It is computed as:
\tau_b = \frac{P - Q}{\sqrt{(P + Q + T)(P + Q + U)}}
- Tau-c (Stuart’s tau-c): A variant normalized for rectangular tables, computed as:
\tau_c = \frac{2(P - Q)}{n^2 (m - 1) / m}
In these formulas, P is the number of concordant pairs, Q is the number of discordant pairs, T is the number of pairs tied only in x, U is the number of pairs tied only in y, n is the sample size, and m is the smaller of the numbers of distinct values in x and in y. A pair tied in both x and y is counted in neither T nor U. Both variants reduce to Kendall’s original tau-a when no ties are present.
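The two formulas can be checked by counting pairs directly. The following is a minimal sketch, not SciPy's implementation: `tau_by_hand` is a hypothetical helper name, and it uses a simple O(n²) loop over all pairs, which is only practical for small inputs.

```python
from itertools import combinations
from math import sqrt

def tau_by_hand(x, y):
    """Return (tau_b, tau_c) computed directly from pair counts."""
    P = Q = T = U = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue        # tied in both x and y: counted in neither T nor U
        elif dx == 0:
            T += 1          # tied only in x
        elif dy == 0:
            U += 1          # tied only in y
        elif dx * dy > 0:
            P += 1          # concordant pair
        else:
            Q += 1          # discordant pair
    n = len(x)
    m = min(len(set(x)), len(set(y)))
    tau_b = (P - Q) / sqrt((P + Q + T) * (P + Q + U))
    tau_c = 2 * (P - Q) / (n**2 * (m - 1) / m)
    return tau_b, tau_c

tau_b, tau_c = tau_by_hand([12, 2, 1, 12, 2], [1, 4, 7, 1, 0])
print(round(tau_b, 4), round(tau_c, 4))  # -0.4714 -0.48
```

For this data P = 2, Q = 6, T = 1, U = 0, n = 5, and m = 3, so tau-b is -4/√72 and tau-c is -8/(50/3). The sketch divides by zero if either input is constant; it does not guard against that case.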
The function also returns a p-value for testing the null hypothesis that there is no association between the two variables (\tau = 0). For more information on Kendall’s tau, see the SciPy documentation and the Kendall rank correlation coefficient article on Wikipedia.
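For a small sample without ties, a two-sided p-value can be illustrated by enumerating every ordering of y and counting how many are at least as extreme as the observed one. This is a sketch under those assumptions (no ties, small n). The helper names `concordance_score` and `exact_p_value` are hypothetical, and this brute-force approach is not necessarily how SciPy computes its p-value.

```python
from itertools import combinations, permutations

def concordance_score(x, y):
    """Return P - Q, the concordant minus discordant pair count."""
    score = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        prod = (x1 - x2) * (y1 - y2)
        score += (prod > 0) - (prod < 0)
    return score

def exact_p_value(x, y):
    """Two-sided permutation p-value; assumes no ties and a small sample."""
    observed = abs(concordance_score(x, y))
    perms = list(permutations(y))
    # Count orderings of y whose |P - Q| is at least the observed value.
    extreme = sum(abs(concordance_score(x, p)) >= observed for p in perms)
    return extreme / len(perms)

print(round(exact_p_value([1, 2, 3, 4], [1, 2, 3, 4]), 4))  # 0.0833
```

With four observations there are 4! = 24 orderings, and only the fully increasing and fully decreasing ones reach |P - Q| = 6, giving 2/24 ≈ 0.0833. This means a perfect correlation on four points cannot reach significance at the 0.05 level.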
This example function is provided as-is without any representation of accuracy.
Excel Usage
=KENDALLTAU(x, y, kendalltau_variant)
- x (list[list], required): Array of rankings or observations. Must be the same length as y.
- y (list[list], required): Array of rankings or observations. Must be the same length as x.
- kendalltau_variant (str, optional, default: “b”): Defines which variant of Kendall’s tau is returned.
Returns (list[list]): 2D list [[tau, p_value]], or error message string.
Examples
Example 1: Kendall’s tau-b with tied values
Inputs:
| x | y | kendalltau_variant |
|---|---|---|
| 12 | 1 | b |
| 2 | 4 | |
| 1 | 7 | |
| 12 | 1 | |
| 2 | 0 | |
Excel formula:
=KENDALLTAU({12;2;1;12;2}, {1;4;7;1;0}, "b")
Expected output:
| tau | p_value |
|---|---|
| -0.4714 | 0.2827 |
Example 2: Kendall’s tau-c with tied values
Inputs:
| x | y | kendalltau_variant |
|---|---|---|
| 12 | 1 | c |
| 2 | 4 | |
| 1 | 7 | |
| 12 | 1 | |
| 2 | 0 | |
Excel formula:
=KENDALLTAU({12;2;1;12;2}, {1;4;7;1;0}, "c")
Expected output:
| tau | p_value |
|---|---|
| -0.48 | 0.2827 |
Example 3: Perfect positive correlation
Inputs:
| x | y | kendalltau_variant |
|---|---|---|
| 1 | 1 | b |
| 2 | 2 | |
| 3 | 3 | |
| 4 | 4 | |
Excel formula:
=KENDALLTAU({1;2;3;4}, {1;2;3;4}, "b")
Expected output:
| tau | p_value |
|---|---|
| 1 | 0.0833 |
Example 4: Perfect negative correlation
Inputs:
| x | y | kendalltau_variant |
|---|---|---|
| 1 | 4 | b |
| 2 | 3 | |
| 3 | 2 | |
| 4 | 1 | |
Excel formula:
=KENDALLTAU({1;2;3;4}, {4;3;2;1}, "b")
Expected output:
| tau | p_value |
|---|---|
| -1 | 0.0833 |
Python Code
from scipy.stats import kendalltau as scipy_kendalltau


def kendalltau(x, y, kendalltau_variant='b'):
    """
    Calculate Kendall's tau, a correlation measure for ordinal data.

    See: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.kendalltau.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): Array of rankings or observations. Must be the same length as y.
        y (list[list]): Array of rankings or observations. Must be the same length as x.
        kendalltau_variant (str, optional): Defines which variant of Kendall's tau is returned. Valid options: 'b' (tau-b), 'c' (tau-c). Default is 'b'.

    Returns:
        list[list]: 2D list [[tau, p_value]], or error message string.
    """
    def to2d(val):
        # Wrap a bare scalar so it can be handled like a 2D range.
        return [[val]] if not isinstance(val, list) else val

    def flatten(arr):
        # Collapse a 2D (or already flat) list into a single flat list.
        flat = []
        for row in arr:
            if isinstance(row, list):
                flat.extend(row)
            else:
                flat.append(row)
        return flat

    x = to2d(x)
    y = to2d(y)

    # Convert every cell to float; non-numeric cells produce an error string.
    try:
        x_array = [float(val) for val in flatten(x)]
        y_array = [float(val) for val in flatten(y)]
    except (ValueError, TypeError):
        return "Invalid input: x and y must contain numeric values."

    if len(x_array) != len(y_array):
        return "Invalid input: x and y must have the same length."
    if len(x_array) < 2:
        return "Invalid input: arrays must contain at least 2 elements."
    if kendalltau_variant not in ["b", "c"]:
        return "Invalid input: kendalltau_variant must be 'b' or 'c'."

    try:
        result = scipy_kendalltau(x_array, y_array, variant=kendalltau_variant)
        return [[float(result.statistic), float(result.pvalue)]]
    except Exception as e:
        return f"scipy.stats.kendalltau error: {e}"
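The input normalization at the top of the function can be tried in isolation. The sketch below copies the two helpers and assumes that a column range arrives as a list of single-element rows, a row range as one list inside a list, and a single cell as a bare scalar. That calling convention is an assumption for illustration and is not stated in the source.

```python
def to2d(val):
    # Wrap a bare scalar so it can be handled like a 2D range.
    return [[val]] if not isinstance(val, list) else val

def flatten(arr):
    # Collapse a 2D (or already flat) list into a single flat list.
    flat = []
    for row in arr:
        if isinstance(row, list):
            flat.extend(row)
        else:
            flat.append(row)
    return flat

column = [[12], [2], [1], [12], [2]]   # assumed shape of a column range
row = [[1, 4, 7, 1, 0]]                # assumed shape of a row range
scalar = 3                             # assumed shape of a single cell

print(flatten(to2d(column)))  # [12, 2, 1, 12, 2]
print(flatten(to2d(row)))     # [1, 4, 7, 1, 0]
print(flatten(to2d(scalar)))  # [3]
```

Because both inputs are flattened before the length check, a column range and a row range with the same number of cells are treated as compatible. A single cell flattens to one element and is therefore rejected by the minimum-length check.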